This is the first section in a proposed monograph on algorithmic complexity
theory. Future sections shall include: Information Theory as a
Proof Technique; Algorithms Using Linear Form Inequalities; Some Probabilistic
Analyses of Algorithms; etc. Comments, suggestions, and corrections are
welcomed. Please let me know what you think.

This is not a limited distribution document, although I may wish to
publish it later. Anyone who develops an idea based on this work to a more 
advanced state is welcome to publish first. I would be very eager to see 
any such result as soon as possible. 



SECTION I: A FORMULATION OF BASIC CONCEPTS 


§1 Definition of the Problem

Many readers will recall the advent of the "fast Fourier series" paper
of Cooley and Tukey [12]. This paper described an algorithm to evaluate the
complex Fourier series

$$X(j) = \sum_{k=0}^{n-1} A(k)\, W^{jk}, \qquad j = 0, 1, \ldots, n-1,$$

where $W = e^{2\pi i/n}$. The algorithm required only $cn \log n$ complex additions
and multiplications to perform this evaluation, where previous methods had used
$cn^2$ such operations. The immediate result of this discovery was a dramatic
improvement in performance of calculations which were basic in a wide spectrum
of applications; programs which had used hours of computer time every
day could now be rewritten using the Cooley-Tukey algorithm to take only
minutes!
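
The contrast between the two operation counts can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the function names are ours, and the fast routine is the familiar radix-2 special case (n a power of two) rather than the general algorithm of [12]. The direct routine forms each of the n sums term by term, about n^2 multiplications in all; the recursive routine splits the sum into even and odd indexed terms and so performs on the order of n log n.

```python
import cmath

def dft_direct(a):
    """Evaluate X(j) = sum_k A(k) W^(jk) term by term (about n^2 multiplications)."""
    n = len(a)
    w = cmath.exp(2j * cmath.pi / n)          # W = e^(2 pi i / n), as in the text
    return [sum(a[k] * w ** (j * k) for k in range(n)) for j in range(n)]

def fft_radix2(a):
    """Same sums by even/odd splitting; assumes len(a) is a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft_radix2(a[0::2])                # transform of A(0), A(2), ...
    odd = fft_radix2(a[1::2])                 # transform of A(1), A(3), ...
    out = [0j] * n
    for j in range(n // 2):
        t = cmath.exp(2j * cmath.pi * j / n) * odd[j]   # W^j times the odd part
        out[j] = even[j] + t
        out[j + n // 2] = even[j] - t         # uses W^(j + n/2) = -W^j
    return out
```

Both routines return the same values up to rounding; only the number of operations differs.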

This improvement of an old and long accepted algorithm dramatizes the 
need for a systematic study of the following problem: 

THE PROBLEM OF ALGORITHMIC COMPLEXITY: Given an "important" task which may
be performed by some set of algorithms implementable on a "standard computer",
what is the "minimum time" needed to perform this task using any such
algorithm?


The concept of the "importance" of a task shall be left intuitive, but
the reader may substitute the phrase "occurring frequently in applications",
as for example the ordering of a sequence of numbers or the multiplication of
two matrices. We shall spend a great deal of time in this section clarifying
what we mean by "standard computer" and "time taken by an algorithm".

Our eventual aim of course is to find the best algorithm to perform each
task; but to prove that an algorithm is best for a task we must solve the
Problem of Algorithmic Complexity. As has been pointed out by a number of
researchers, notably by Jack Schwartz, there are other measures of complexity
for a task: the length of the shortest program for an algorithm that performs
the task, the minimal size of storage used by any algorithm performing
the task, etc. With no intent to ignore these other measures of complexity
(indeed we cannot; see §2, Example 1, this section), we shall focus our
attention on the time performance of an algorithm.

In the last few years a great deal of research has been done on problems 
of algorithmic complexity. At the same time, many sophisticated computer
scientists have expressed perplexity (or lodged objections) concerning both
the aims and the basic assumptions of this field of inquiry. This is probably
because algorithmic complexity is such a new field that little
has been written on it, and misconceptions tend to arise in such circumstances.

In this section we shall attempt to indicate the scope of the field and 
scrutinize the assumptions of commonly employed techniques of proof. We
undertake this to answer some of the objections we have mentioned, but more
importantly to clarify the basic assumptions upon which any useful theory
must be founded.



§2 The "Standard" Computer

Since we are attempting to present the groundwork for a theory which will
find wide applications, we would like our model for a "standard" computer to
reflect the structure of the general purpose computer. Thus we stipulate
that information shall be stored as sequences of zeroes and ones (and thereby
exclude analog computers from consideration). We also require that the
information stored should be directly accessible; any bit of information stored
in memory may be retrieved in a length of time which is constant, irrespective
of its "position" with respect to our previous access.

By imposing this requirement, of course, we eliminate the entire spectrum
of tape automata as subjects of consideration. There are drawbacks to this
step. In particular we exclude Turing machines, which are capable of performing
any task for which an effective procedure has been conceived, and
are attractive to the theorist because of their simplicity of concept and
firm basis in rigor. We cannot hope to achieve comparable standards of
rigor in dealing with general purpose computers for years to come. Furthermore,
progress has already been made in studying the complexity of algorithms, using
Turing machines to good effect. We shall consider this latter point further
when we discuss the field of computational complexity. For the present, we
limit ourselves to a few remarks in defense of the assumption that "standard"
computers should have direct access memory.

The paramount consideration here is that general purpose computers do
have direct access memory; our desire to reflect this fact in our model springs
from our purpose of developing a theory with wide applications. Turing
machines were developed historically to investigate the question of what
problems were effectively decidable; the time which was spent in working out
an algorithm was never a consideration. Now it is possible, because of the
elemental nature of the Universal Turing machine, that one might be able
to calculate a measure of implicit complexity of a task which bore a one to
one correspondence to the minimal time needed to perform the task on any
conceivable machine. Naturally this calculation would only be possible for
tasks known to be computable, or we would have solved the halting problem.
Even if we postulate the computability of such an implicit complexity (which
is far removed from present day capabilities) the following problem still
faces us: what is the one to one correspondence which will enable us to
deduce the minimal time required to perform the task on a general purpose
computer? One cannot get out of a theory what has not been put into it. A
general purpose computer has direct access memory, and time considerations are
highly sensitive to this fact. Hence, a model such as the standard computer
with direct access memory must be developed and studied.

Returning to our description of the standard computer, we stipulate that
memory is segmented into contiguous strings of bits, each of standard length,
which shall be called "words". We make no assumption concerning the length
of a word except to indicate that it is quite small in relation to the number
of bits in memory (an acceptable extreme case is that of words consisting of
single bits). The words are said to have "locations" in memory and are the
basic elements on which the machine acts; indeed it is the words which we
may assume are directly addressable, rather than the bits themselves.

The computer acts on the words by performing "elementary operations".
These are the usual machine language operations and include: zeroing a word;
moving a word from one location to another; performing logical operations on
two words such as "and", "or", "nor", etc.; performing arithmetic operations
on two words such as "add", "subtract", "multiply", "divide", etc.; comparison
of two words with a branch in the control of the "program" of the computer
depending on the outcome of the comparison. This is not a complete list, but
it may be filled in by any reader with machine language experience. One word
of warning, however: single instructions which act on a "field" of words,
such as moving an arbitrarily large contiguous block of memory from one
location to another, cannot be thought of as elementary operations. The reason
for this is that in the present early stage of development of algorithmic
complexity, the time taken by an algorithm is often estimated by counting the
number of elementary operations the algorithm performs. The elementary operations
are assumed to take about the same length of time, up to a reasonable
constant factor of multiplication. (This crude assumption is not always used:
see S. Winograd [1].) An instruction which acts on a field of words is really
a machine implemented macro-instruction which may take an arbitrarily greater
time to apply than the other operations mentioned above. This automatically
excludes it from our list of elementary operations.

Now that we have imposed a structure on the memory of our standard computer
by segmenting it into words, we note that this structure is not essential
to the theory we are developing. Recall that words which contained only
one bit were said to be an acceptable special case; in this circumstance the
only elementary operations are those which are normally implicit in register
arithmetic: zeroing a bit, and, nor, or, not, move, and test if bit is one
with branch. Using these we may segment core into finite length words and
create "macro-operations" which will emulate the elementary operations used
for words, above. While it is true that it will now take much longer to
multiply two words than to add them, the time differential is constant, a
function of the word length, which is very small in comparison to the total
memory size.
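
As a rough illustration of such a macro-operation, the Python sketch below emulates the addition of two words using only one-bit "and", "or" and "not" operations, and counts how many it performs. The representation (a word as a list of bits, least significant first), the helper names, and the particular ripple-carry circuit are all our own assumptions; the only point is that the count is a fixed multiple of the word length.

```python
def to_bits(value, width):
    """Represent an integer as a list of bits, least significant first."""
    return [(value >> i) & 1 for i in range(width)]

def from_bits(bits):
    """Inverse of to_bits."""
    return sum(b << i for i, b in enumerate(bits))

def add_words_bitwise(x_bits, y_bits):
    """Add two equal-length words with one-bit and/or/not operations only.

    Returns (sum_bits, op_count); a carry out of the top bit is dropped.
    """
    ops = 0
    carry = 0
    out = []
    for xb, yb in zip(x_bits, y_bits):
        # exclusive-or of the two input bits: (x or y) and not (x and y)
        x_xor_y = (xb | yb) & (1 - (xb & yb))
        # sum bit: exclusive-or of that result with the incoming carry
        s = (x_xor_y | carry) & (1 - (x_xor_y & carry))
        # outgoing carry
        carry = (xb & yb) | (x_xor_y & carry)
        ops += 11        # 4 + 4 + 3 one-bit operations in the three lines above
        out.append(s)
    return out, ops
```

For an 8-bit word this sketch always performs 88 one-bit operations, whatever the operands: a constant which depends on the word length and not on the size of memory.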


In this theory we are concerned almost entirely with the asymptotic
behavior, up to a multiplicative constant, of the time required by an algorithm;
e.g. the time required to multiply two n x n matrices using the standard
method in a general purpose computer is $cn^3$. The constant may only be determined
when one is confronted by a specific computer, and is best derived by
a simple test; the time taken to multiply two n x n matrices where n is
known quickly gives the value of c. The asymptotic behavior, up to a constant,
of the time taken by an algorithm is adequately estimated by counting
the number of elementary operations performed. The estimate is not sensitive
to the difference between the usual machine and the one-bit word machine
which emulates it. We shall continue to speak of memory as consisting of
words and we will use standard elementary operations, but we must be clear that
this is a convenience and not a basic assumption of the model.

The following assumption, on the other hand, is entirely basic. In the
operation of the standard computer we stipulate that the elementary operations
must be applied serially, that is, in strict consecutive order. The
necessity for this assumption for a model which purports to mirror the
behavior of general purpose computers in common use is obvious. Drawbacks
exist, however. There are special purpose computers in existence (such as
array processors) whose performance is not restricted by this assumption.*
From a standpoint of applications, the theory of algorithmic complexity
cannot afford to ignore such special purpose computers. From a more theoretical
standpoint, the investigations of S. Winograd [2,3] which give
lower bounds on the times to perform register arithmetic would fit in the
framework of standard computers if parallel computation were permitted. We
must, however, focus our attention on a computer with serial operation. It
is true that we are thereby ignoring a byway of algorithmic complexity which
permits parallel operation, but the two alternative assumptions are so basic
and give rise to such different models that we feel any attempt to investigate
both in the same work would cause unnecessary confusion and complication.

* However, sequential operation is canonical in the following sense: the number
of connections (from memory to accumulator) grows linearly with the size
of memory if operation is sequential; for parallel processors the number of
connections must grow at a much greater rate.

We sum up the discussion above in a definition:

Definition: A "standard computer" is a sequential machine with finite direct
access binary memory.

The assumption of finite memory was made implicitly when we spoke of
finite word length which was "small in comparison to the total size of
memory". In actual practice, we shall assume that memory is large enough to
hold the intermediate and final results of the algorithms we will be considering;
putting this another way, we shall have to guard against considering
algorithms which require a ridiculously large amount of memory. We illustrate
this by an example.


Example 1: We are given the task of multiplying two n x n (0,1)-matrices.
We may assume that the task will be performed a tremendous
number of times, so that any amount of "precalculation" which will
simplify the task for a proposed algorithm is justified. We proceed
as follows: multiply all possible pairs of n x n (0,1)-matrices
in the standard way and store the results in memory. Now
given any two matrices to multiply, we need merely "look up"
the product, calculating the address where the result was stored
by any of a number of schemes based on the entries of the two
matrices to be multiplied. This calculation only takes $cn^2$
operations, which may be shown to be best possible from a standpoint
of information theory (more on this later). However, we
require $n^2 \cdot 2^{2n^2}$ words of memory to store all these matrices,
which is clearly impractical for any large value of n.
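
The precalculation scheme of this example can be tried for very small n. The sketch below is one possible reading, not a prescribed method: matrices are tuples of rows, products are formed in ordinary integer arithmetic (an assumption, since the text does not say whether the product is Boolean), and the address is obtained by reading the 2n^2 entries of the two matrices as a single binary number. All function names are ours.

```python
from itertools import product

def multiply(a, b, n):
    """Standard method: about n^3 multiplications."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n))
        for i in range(n)
    )

def address(a, b):
    """Read the 2*n^2 entries of a and b as one binary number (about c*n^2 steps)."""
    addr = 0
    for matrix in (a, b):
        for row in matrix:
            for bit in row:
                addr = 2 * addr + bit
    return addr

def build_table(n):
    """Precalculate every product; the table has 2^(2*n^2) entries of n^2 words."""
    rows = list(product((0, 1), repeat=n))
    matrices = list(product(rows, repeat=n))
    table = [None] * (2 ** (2 * n * n))
    for a in matrices:
        for b in matrices:
            table[address(a, b)] = multiply(a, b, n)
    return table
```

For n = 2 the table already holds 256 products, or 1024 words; for n = 3 it would hold 2^18 products, and for n = 6 more than 10^21, which is the point of the example.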


There can be no hard and fast rule by which one excludes algorithms
from our theory because of memory considerations. A. Meyer and M. Fischer [4]
use a variant on the above scheme to improve the performance of Strassen's
method in the multiplication of (0,1)-matrices; however, they do not assume
that precalculation is "free", so their variant has intrinsic value.

We note that a restriction was placed on word length: that it should
be small in comparison to the size of memory. We should at the same time
keep some common-sense absolute bound on the word length; we are trying here
to abstract a model of general purpose computers which is independent of the
length of a word in a particular case. The argument given above, that we
may emulate a standard computer with words of constant length c using one-bit
words, breaks down if c is allowed to grow without bound. The reason
for this is that the parallel processing of register arithmetic must be
emulated by sequential commands, and different elementary operations, such
as addition and multiplication, are intrinsically different in their complexity
of computation in a serial processor (see Cook [5]). This fact also
holds true for the parallel computation performed in registers [3,4], and
the usefulness of the concept of elementary operations would be destroyed
if word size were allowed to grow without bound.

The argument we have made in favor of restricting the size of words, that 
the parallel nature of computation in word arithmetic is poorly simulated by
serial one bit word operations, may be viewed from a different direction. One 
of our most basic assumptions is the serial operation of a standard computer; 
permitting arbitrarily large bit strings to be operated on in parallel 
weakens this assumption. I am grateful to R. Berlekamp for the following 
illustrative example. 

Example 2: We are given the task of counting the number of 1-bits
in a string of bits which we may assume fits in a word, w.
Because of the seriality of computation, it seems intuitively
obvious that we must view each bit position in the string,
incrementing a counter when a bit turns out to be 1. This
problem was given to machine language programmers on a 32-bit
word machine, and almost every solution submitted was of
this form. However, consider the following solution.

We first define a number of constant "masks":

    a_1 = 10101010...1010
    b_1 = 01010101...0101
    a_2 = 11001100...1100
    b_2 = 00110011...0011

where $a_k$ is a concatenation of the string

    11...1 00...0      ($2^{k-1}$ ones followed by $2^{k-1}$ zeros)

repeated the number of times required until a 32-bit word
has been constructed. The words $b_k$ are defined as NOT($a_k$),
i.e. all bits are reversed. Clearly $k \le 5$.

The following process is used:

             x ← w, the word whose 1-bits we wish to count
             i ← 0
    Proceed  i ← i + 1
             y ← x.and.a_i
             z ← x.and.b_i
             j ← 2**(i-1)
             y ← RIGHT SHIFT j (y)
             x ← y + z
             if i.LT.5, go to Proceed.

It is left as a simple exercise that when the loop is exhausted,
x is the number of 1-bits in the word w. Suppose that the word
length were $N = 2^k$. Then the number of 1-bits in an N bit word
would be calculated as in the process above in $c \log_2 N$ operations
instead of the $c'N$ one would expect from a serial search.

This example may seem startling, but is merely a demonstration of how
the assumption of serial computation may be circumvented if arbitrarily large
words are permitted. For simplicity, we shall assume that the word length
is great enough so that, for the numbers met with in our tasks, we need take
no precautions to guard against overflow.
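
Readers who wish to check the exercise of Example 2 may find the following Python sketch helpful. It is our own transcription of the masking process for a 32-bit word; the names `make_masks`, `MASKS` and `count_ones` are hypothetical, and the masks are built once by a small loop rather than written out as constants.

```python
WORD_BITS = 32

def make_masks():
    """Return [(a_1, b_1), ..., (a_5, b_5)] for a 32-bit word."""
    full = (1 << WORD_BITS) - 1
    masks = []
    j = 1
    while j < WORD_BITS:
        block = ((1 << j) - 1) << j            # j ones followed by j zeros
        a = 0
        for pos in range(0, WORD_BITS, 2 * j):
            a |= block << pos                  # repeat the block across the word
        masks.append((a, ~a & full))           # b is NOT(a) within 32 bits
        j *= 2
    return masks

MASKS = make_masks()

def count_ones(w):
    """Count the 1-bits of a 32-bit word in five mask, shift and add rounds."""
    x = w
    for i, (a, b) in enumerate(MASKS, start=1):
        j = 2 ** (i - 1)
        y = (x & a) >> j       # upper half of each block, moved down
        z = x & b              # lower half of each block
        x = y + z              # each block of 2j bits now holds its own count
    return x
```

After round i, every block of 2^i bits holds the number of 1-bits originally in that block, so five rounds suffice for 32 bits.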




§3 Minimal Time for an Algorithm


We speak rather glibly of the time taken by an algorithm to perform a
task, but there is an unfortunate lack of precision in this concept. It
springs from the fact that the prescription of a task is vague: typically
we are supplied with input in a certain format (e.g., a list to be ordered)
and the "task" is to process the input to arrive at a well-defined output
(e.g., the ordered list).

Given an algorithm to perform some task, together with a specific input,
the time taken by the algorithm may be arrived at empirically; however, given
a different input we may find that the time taken to perform the same algorithm
is different. Let us make it clear that we are not making a trivial distinction,
e.g.: it takes longer using most algorithms to order a list of 1000 words
than it does for a list of 10 words. We may define the task more precisely
by letting $T_n$ be the task of ordering a list of n words. Indeed we shall
usually have this dependence on a parameter n. However, even for tasks that
are defined this explicitly, algorithms may vary in their execution time
according to what input is given; this is a function of the implicit structure
of the input. We illustrate this idea by the following example.

Example 1: We are given a list of n integers X(1), X(2), ..., X(n).
The task is to order this list, so that X(1) becomes the smallest
element in the list, X(2) the second smallest, ..., X(n) the
largest. We do this using the algorithm INTERCHANGE SORT.

             PASS ← 0
    NEWPASS  PASS ← PASS + 1
             I ← 0
             COUNT ← 0                  ...initialize for pass
    PROCEED  I ← I + 1
             IF X(I+1).GT.X(I), go to CHECK, else
             DUMMY ← X(I)
             X(I) ← X(I+1)              ...perform interchange
             X(I+1) ← DUMMY
             COUNT ← COUNT + 1          ...count interchange
    CHECK    IF I.LT.n-PASS, go to PROCEED, else
             IF COUNT.EQ.0, HALT, else
             IF PASS.LT.n, go to NEWPASS, else
             HALT.

This algorithm is in very common use for ordering short
lists and we shall not explain its operation. We do note
that a count is kept on the number of interchanges in each pass,
and the algorithm halts when a pass is completed with no interchange
made. This feature is used only occasionally, and is in
fact inefficient for most applications.

We now ask the anticipated question: how long does this algorithm take
to perform its task? Not surprisingly, the answer depends on the form of the
input. Let us assume the array X(1), X(2), ..., X(n) is some permutation of the
integers 1, 2, ..., n. Now if the array X consists of the integers 1, 2, ..., n
in reversed order, it should be clear that the maximum number, n, of passes
will be made and further that the interchange of X(I) with X(I+1) will never
be "skipped" by the IF statement which precedes it. The number of elementary
operations in pass k is given by the number of elementary operations between
the statements labeled PROCEED and CHECK inclusive (i.e., seven), multiplied
by the number of values I takes on in the k-th pass (n-k). The remaining
statements, being outside the central loop, may be ignored as insignificant
and we estimate the number of elementary operations performed by the algorithm
with this input as

$$7[(n-1) + (n-2) + \cdots + 2 + 1] = \tfrac{7}{2}\, n(n-1).$$

On the other hand, if the array X consists of the integers 1, 2, 3, ..., n
in increasing order, then in the first pass no interchanges will be made, the
count will remain zero, and we will halt after the first pass having performed
3(n-1) operations in the central loop; we add 5 to this number to get the
total number of operations performed.
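
These two counts can be checked mechanically. The Python sketch below is our own transcription of INTERCHANGE SORT with two counters added: `loop_ops` for the statements from PROCEED to CHECK inclusive, and `other_ops` for everything outside the central loop. The function name and the bookkeeping are assumptions made for illustration, and the list is assumed to have at least two elements.

```python
def interchange_sort(values):
    """Sort a copy of `values`; return (sorted_list, loop_ops, other_ops)."""
    x = [None] + list(values)      # 1-based indexing, as in the text
    n = len(values)
    loop_ops = 0                   # operations from PROCEED to CHECK inclusive
    other_ops = 1                  # PASS <- 0
    pass_no = 0
    while True:                    # NEWPASS
        pass_no += 1
        i = 0
        count = 0
        other_ops += 3             # PASS <- PASS + 1, I <- 0, COUNT <- 0
        while True:
            i += 1                                 # PROCEED
            loop_ops += 2                          # increment, then comparison
            if not x[i + 1] > x[i]:
                x[i], x[i + 1] = x[i + 1], x[i]    # three moves via DUMMY
                count += 1
                loop_ops += 4                      # three moves plus the count
            loop_ops += 1                          # CHECK: IF I.LT.n-PASS
            if not i < n - pass_no:
                break
        other_ops += 1             # IF COUNT.EQ.0
        if count == 0:
            break
        other_ops += 1             # IF PASS.LT.n
        if not pass_no < n:
            break
    return x[1:], loop_ops, other_ops
```

On an increasing list the sketch reports 3(n-1) loop operations and 5 others, as stated. On a reversed list it reports 7n(n-1)/2 loop operations plus 3 more for the final pass, in which no interchange occurs; the estimate in the text leaves that pass out as insignificant.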

As we have stated before, the tasks we are concerned with are "parametrized"
by an integer n (multiplying n x n matrices; evaluating an n-th degree polynomial)
and we study algorithms from a standpoint of the asymptotic behavior,
up to a multiplicative constant, of the time they need for execution. INTERCHANGE
SORT is an example of what we call a "high variance" algorithm: its
asymptotic behavior is not defined, for with different inputs it may take
$cn$ to $c'n^2$ elementary operations to execute. There are two approaches in
general use to give some meaning to the concept of "the time taken" by a high
variance algorithm.

We first consider the technique of worst case analysis: what is the
greatest length of time that an algorithm may take when supplied with any conceivable
input of the right format? It is easily seen that the worst possible
input for INTERCHANGE SORT is the list of integers in inverse order, which we
have analyzed, and the algorithm is said to take $cn^2$ steps.

A second type of technique often employed to find the time taken by high
variance algorithms is that of probabilistic analysis. Here we assume that
all possible inputs to an algorithm have a known probability; for each input
we evaluate the time taken by the algorithm, t(input), and we calculate the
expected time which the algorithm will take, E[t]. Although this calculation
may seem to be so incredibly difficult that it is not possible in any but the
most elementary algorithms, this impression is incorrect; a truly beautiful
application of probabilistic analysis of this type is given in [7], which deals
with searching in dynamically changing binary tree structured lists. In the
particular case of INTERCHANGE SORT given above, assuming all permutations of
the integers 1, ..., n are equally likely inputs, the expected time the algorithm
will take, E[t], is $cn^2$ steps. (We do not prove this, but it is not difficult.)

It is not at all common that worst case analysis and probabilistic analysis
give the same asymptotic time estimate, as they do in the case of INTERCHANGE
SORT. The expected time for a search in the binary tree structured list referred
to in [7] is $c \log_2 n$, where n is the number of entries in the list; but a
worst case analysis gives $cn$ as the time taken by the search algorithm. Which
estimate should one believe?


Certainly both approaches have their merits and neither should be discarded
in favor of the other. One of the advantages of studying the expected time
to perform an algorithm of high variance type is that it gives the researcher
something to aim at: can an algorithm be found to perform the same task whose
worst time is of the same order of magnitude as the expected time of the
known high variance algorithm? In the case of the binary tree structured list
search this was done! It was achieved by Adel'son-Vel'skiy and Landis in
Moscow and reported on with some improvements by C.C. Foster [8]. It is
probable that the motivation was supplied by a probabilistic analysis carried
out by P.F. Windley [9], who developed independently many of the same results
as those in [7]. Without such an analysis, which showed the promise of binary
tree structures for sorting and searching, the improved algorithm (with worst
time for search $c \log_2 n$) might not have been found.

There is a school of thought which holds that time estimates derived by
probabilistic analysis of algorithms serve no utilitarian purpose in applications.
This attitude is justified in cases where the basic assumption of probabilistic
analysis is suspect, e.g., when not all inputs to an algorithm are equally probable.
If we may assign probabilities (even though not equal) to the various possible
inputs, a probabilistic analysis is still feasible, however: the expected
time for the execution of the algorithm can sometimes be derived.

A much more basic objection exists to probabilistic analysis estimates,
however, which is quite difficult to answer. The objection is to a large extent
subjective but may be expressed as follows: how can one trust an algorithm which
has good expected time for performance? What if, while performing a sequence
of important calculations, a high variance algorithm with excellent expected
performance continually receives input which slows it down to a performance
time close to what the worst case analysis would predict? Would it not be
better to use an algorithm which has a better worst case time estimate?

Certainly, in very special cases, such as real time control applications
where time need not be optimized but only kept within an absolute bound, such
caution is justified. But consider the hash coding algorithm as it is usually
applied, where the hashing function is not calculated with a particular list
of key-words in mind. A probabilistic analysis of this algorithm reveals that
under most circumstances it rivals associative memory for speed of "look-up"
of a key-word. But a worst case analysis would consider only the case where
all key-words are hashed to the same location, and the algorithm would be
assigned the same execution time as linear search, $cn$ operations. The AVL
tree structure referred to in [8] would match any key-word in at most $c \log n$
operations, as would logarithmic search (the latter algorithm however must
be performed on an ordered list, which is not amenable to insertions and deletions).
It is doubtful that many programmers would use logarithmic search
in place of hash coding, in spite of the terrible performance of hash coding
revealed by worst case analysis.
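
The gap between the two estimates for hash coding is easy to exhibit. The following Python sketch rests on several assumptions of ours: collisions are resolved by chaining (the text does not specify a scheme), cost is measured in key comparisons, and all function names are hypothetical. It compares a hashing function which spreads the keys evenly, one which sends every key to the same location, and logarithmic search on the ordered list.

```python
def build_hash_table(keys, buckets, hash_fn):
    """Store each key in the chain selected by its hash value."""
    table = [[] for _ in range(buckets)]
    for key in keys:
        table[hash_fn(key) % buckets].append(key)
    return table

def hash_probes(table, key, hash_fn):
    """Key comparisons needed to find `key` by scanning its chain."""
    chain = table[hash_fn(key) % len(table)]
    return chain.index(key) + 1

def binary_search_probes(sorted_keys, key):
    """Key comparisons needed by logarithmic search on an ordered list."""
    lo, hi, count = 0, len(sorted_keys) - 1, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        count += 1
        if sorted_keys[mid] == key:
            return count
        if sorted_keys[mid] < key:
            lo = mid + 1
        else:
            hi = mid - 1
    raise KeyError(key)
```

With 1024 keys, the evenly spreading function finds every key in one comparison, the degenerate function may need 1024, and logarithmic search never needs more than 11.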

So far in §3, we have been concerned mainly with "high variance algorithms"
which we have treated by example. We should make this concept more concrete.
Let a be an algorithm parametrized by an integer n. Denote the time taken by
the algorithm a in processing some acceptable input, $\alpha$, as $t_a(\alpha)$. Assume
there exists a function $f(n)$ and two constants, $c_1$ and $c_2$, not depending on n,
such that:

$$c_1 f(n) < t_a(\alpha) < c_2 f(n) \tag{3.1}$$

for any acceptable input $\alpha$. Then the algorithm a is defined as a low variance
algorithm; we say that the time taken by the algorithm is $c_1 f(n)$. Any algorithm
which does not have this property is defined as a high variance algorithm.






There are many examples of low variance algorithms which come immediately
to mind, but the necessity for two constants, $c_1$ and $c_2$, to bound the time taken
by an algorithm may not at first be obvious. It might seem that either an
algorithm has high variance or else the time it takes to perform its task is
independent of its input. To see that this is not so, one need merely consider
the INTERCHANGE SORT algorithm of Example 1 with all instructions containing
the variable COUNT removed. Then the number of elementary operations required
by this algorithm to sort the array (n, n-1, n-2, ..., 3, 2, 1) is about $3n^2$, but
to sort the array (1, 2, 3, ..., n-1, n), it is about $\tfrac{3}{2}n^2$.
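
These two figures follow from the operation counts of Example 1, on the assumption (ours, but consistent with that analysis) that the central loop costs six operations when an interchange is made and three when it is skipped, once the COUNT statements are gone. Since the algorithm can no longer halt early, both inputs run through the same passes:

```latex
\begin{align*}
\text{reversed input:}\quad & 6\,[(n-1) + (n-2) + \cdots + 2 + 1] = 3n(n-1) \approx 3n^2, \\
\text{ordered input:}\quad  & 3\,[(n-1) + (n-2) + \cdots + 2 + 1] = \tfrac{3}{2}\,n(n-1) \approx \tfrac{3}{2}\,n^2.
\end{align*}
```

Both counts are bounded above and below by constant multiples of $f(n) = n^2$, so the modified algorithm is of low variance even though its running time is not independent of its input.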

This is one of the major reasons that in speaking of the time taken by
an algorithm, we are concerned with asymptotic behavior only up to a constant of
multiplication. Another reason which has already been mentioned is that we
wish our analysis to have as much generality as possible. Our basic assumption
that all elementary operations take the same length of time is necessary for
any kind of general theory, but it is not precise; it is true only up to a
multiplicative constant.


§4 Toward a More Rigorous Theory 

In this section, we have tried to develop a model, called the standard
computer, which abstracts the essential features of general purpose computers
in most common use today. By "essential" of course, we mean essential to our
purpose of estimating the time taken by algorithms in important applications.
Hopefully, it is clear to the reader that this model has been very carefully
constructed; at each point that a difference of opinion might arise, as in
the question of whether memory could be taken to be infinite, we have tried
to show by example why one course is better than another. We could go on at
great length in this vein, as the most severe limitation we feel in this writing
is the necessity of limiting the number of relevant examples so as not to
lose the impetus of our presentation in a welter of detail.

The standard computer we have outlined seems to lack the most important
feature of a model, that of simplicity. This ignores, however, what is probably
the most far-reaching conclusion of this section: that a one-bit word standard
computer with register arithmetic instructions may emulate (and be emulated
by) a standard computer with thirty or forty bit words and an extensive instruction
set; and this without changing the asymptotic time to perform
algorithms except for the constant of multiplication. A one-bit word standard
computer may be defined about as simply as a Turing machine, and its programming
language should be hardly more difficult. Furthermore, in the next section we
shall present one of the basic approaches to proofs in computational complexity:
the use of information theory to bound below the number of yes-no questions
answered by any algorithm which performs a given task. This puts a measure of
complexity on the task and requires at least as many elementary operations to
perform the task as there are yes-no questions answered. One of the hindrances
to this approach has been the fact that many elementary operations are
hard to put into the framework of information theory: there seem to be no
yes-no questions answered in their performance. In the more basic framework
of one-bit word standard computers, it becomes apparent that operations such
as multiplication and addition do indeed require test and branch instructions
(yes-no questions). In fact, we may formulate the instruction set of the one-bit
word computer so that the instructions "and", "or", "nor", etc. are emulated
by two test and branch instructions. Thus the only elementary operations
needed for a one-bit word standard computer are: set bit zero, set bit
one, move data, and test if bit is one with branch. The question of address
arithmetic requires some thought.
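
To see how two test and branch instructions suffice, consider the following Python sketch. Memory is modelled as a list of bits, and each `if` stands for one "test if bit is one with branch" instruction; the function names and the addressing by list index are our own assumptions, and the question of address arithmetic is set aside here as well.

```python
def bit_and(memory, a, b, dest):
    """dest <- memory[a] AND memory[b], using at most two tests."""
    if memory[a] == 1:            # test bit a, branch
        if memory[b] == 1:        # test bit b, branch
            memory[dest] = 1      # set bit one
            return
    memory[dest] = 0              # set bit zero

def bit_or(memory, a, b, dest):
    """dest <- memory[a] OR memory[b], using at most two tests."""
    if memory[a] == 1:
        memory[dest] = 1
        return
    if memory[b] == 1:
        memory[dest] = 1
        return
    memory[dest] = 0

def bit_nor(memory, a, b, dest):
    """dest <- NOT (memory[a] OR memory[b]), using at most two tests."""
    if memory[a] == 1:
        memory[dest] = 0
        return
    if memory[b] == 1:
        memory[dest] = 0
        return
    memory[dest] = 1
```

Each logical operation thus answers at most two yes-no questions, which is what places it within reach of an information theoretic count.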

We do not try to develop a theory of one-bit word standard computers in
this work. It is felt that such an undertaking would be premature since our
model is new and may require revision. Some time for consolidation and possible
correction of these concepts should be allotted (and here we look to readers for
suggestions) before any attempt is made to codify them.

Although we shall deal mainly with many-bit word standard computers in
the following sections, we feel that the future of algorithmic complexity lies
in the study of how tasks may be performed in some basic, elemental environment
such as we have outlined. Research is presently being jointly undertaken by
L.H. Harper (U.C. at Riverside) and J.E. Savage (Brown) based on the work of
Subbotovskaya [10] and Nečiporuk [11]. This work studies the necessary
length of the Boolean representation of specific functions; it is assumed that
the length is proportional to the time needed for evaluation. A possible
drawback to this formulation is that branching is not permitted within the
Boolean representation. This objection may be surmountable, however, and
we hold out great hope for this approach.